Approximate String Matching Using a Bidirectional Index

Authors

  • Gregory Kucherov
  • Kamil Salikhov
  • Dekel Tsur

Abstract

We study strategies of approximate pattern matching that exploit bidirectional text indexes, extending and generalizing ideas of [6]. We introduce a formalism, called search schemes, to specify search strategies of this type, then develop a probabilistic measure for the efficiency of a search scheme, prove several combinatorial results on efficient search schemes, and finally, provide experimental computations supporting the superiority of our strategies.

Similar resources

FAMOUS: Fast Approximate string Matching using OptimUm search Schemes

Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. The introduction of practical bidirectional indices has opened new possibilities for solving the problem as they allow the search to be started from anywhere within the pattern and extended in both directions. In particular, use of search sch...

n-Gram/2L-approximation: a two-level n-gram inverted index structure for approximate string matching

Approximate string matching is to find all the occurrences of a query string in a text database allowing a specified number of errors. Approximate string matching based on the n-gram inverted index (simply, n-gram Matching) has been widely used. A major reason is that it is scalable for large databases since it is not a main memory algorithm. Nevertheless, n-gram Matching also has drawbacks: th...

Faster Filters for Approximate String Matching

We introduce a new filtering method for approximate string matching called the suffix filter. It has some similarity with well-known filtration algorithms, which we call factor filters, and which are among the best practical algorithms for approximate string matching using a text index. Suffix filters are stronger, i.e., produce fewer false matches than factor filters. We demonstrate experiment...

A New Indexing Method for Approximate String Matching

We present a new indexing method for the approximate string matching problem. The method is based on a suffix tree combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the retrieval time is $O(n^{\lambda})$, for $0 < \lambda < 1$, whenever $\alpha < 1 - e/\sqrt{\sigma}$, where $\alpha$ is the error level tolerated and $\sigma$ is the alphabet size. We experimentally show that this index outperforms by far all othe...

Approximate String Matching

We present a new indexing method for the approximate string matching problem. The method is based on a suffix tree combined with a partitioning of the pattern. We analyze the resulting algorithm and show that the retrieval time is $O(n^{\lambda})$, for $0 < \lambda < 1$, whenever $\alpha < 1 - e/\sqrt{\sigma}$, where $\alpha$ is the error level tolerated and $\sigma$ is the alphabet size. We experimentally show that this index outperforms by far all othe...

Journal:
  • Theor. Comput. Sci.

Volume: 638

Publication date: 2014